See Chapter 2 for a refresher on mathematical notation and formulas, including how to interpret the
various forms of the summation symbol ∑ (the Greek capital sigma). In the rest of this chapter, we use
the simplest form, meaning the form without the i subscripts that refer to specific elements of an array,
whenever possible.
Some statistics books use a notation in which capital letters, such as capital N, refer to census
parameters and the lowercase versions refer to sample statistics. In this book, we make it
clear each time we present this notation whether we are talking about a census or a sample.
Median
Like the mean, the median is a common measure of central tendency. In fact, it could be argued that the
median is the only one of the three that really takes the word central seriously.
The median of a sample is the middle value in the sorted (ordered) set of numbers. By
definition, half of the numbers are smaller than the median, and half are larger. The median of a
population frequency distribution function (like the curves shown in Figure 9-2) divides the total
area under the curve into two equal parts: Half of the area under the curve (AUC) lies to the left
of the median, and half lies to the right.
Consider the sample of diastolic blood pressure (DBP) measurements from seven study participants
from the preceding section. If you arrange the values (in mmHg) in order from lowest to highest, you can
list them as 84, 84, 89, 91, 110, 114, and 116. There are seven values, and 91 is the fourth of the seven
sorted values, so that is the median. Three DBPs in the sample are smaller than 91 mmHg, and three
are larger than 91 mmHg. If you have an even number of values, the median is the average of the two
middle values. Imagine that you add a value of 118 mmHg to the top of your list, giving you
eight values. To get the median, you would take the average of the fourth and fifth values, which would
be (91 + 110)/2 = 100.5 mmHg (don’t be thrown off by the 0.5).
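If you want to check this arithmetic yourself, here is a minimal Python sketch of the rule just described. The function name median_by_hand and the variable names are made up for illustration; the last line cross-checks the result against the median function in Python's standard statistics module.

```python
from statistics import median

# DBP values (mmHg) for the seven participants described in the text
dbp_seven = [84, 84, 89, 91, 110, 114, 116]
# The same sample with the extra 118 mmHg value added
dbp_eight = dbp_seven + [118]


def median_by_hand(values):
    """Hypothetical helper: the middle sorted value, or the average of
    the two middle values when the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


median_seven = median_by_hand(dbp_seven)
median_eight = median_by_hand(dbp_eight)
print(median_seven, median_eight)  # prints: 91 100.5

# Cross-check against the standard library
assert median_seven == median(dbp_seven)
assert median_eight == median(dbp_eight)
```

Sorting first matters: the rule works on the ordered values, not on the order in which the measurements were recorded.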
Statisticians often say that they prefer the median to the mean because the median is much less strongly
influenced by extreme outliers than the mean. For example, if the largest value for DBP had been very
high (such as 150 mmHg instead of 116 mmHg), the mean would have jumped from 98.3 mmHg up
to 103.1 mmHg. But in the same case, the median would have remained unchanged at 91 mmHg. Here’s an
even more extreme example: If a multibillionaire were to move into a certain state, the mean family
net worth in that state might rise by hundreds of dollars, but the median family net worth would
probably rise by only a few cents (if it were to rise at all). This is why you often hear the median
rather than the mean income in reports comparing income across regions.
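The DBP outlier comparison above can be reproduced in a few lines of Python. This is only a sketch using the standard statistics module; the variable names are our own, and the numbers are the seven DBP values from the text with 116 mmHg swapped for 150 mmHg.

```python
from statistics import mean, median

# Original seven DBP values (mmHg) from the text
dbp_original = [84, 84, 89, 91, 110, 114, 116]
# Same sample, with the largest value replaced by an extreme 150 mmHg
dbp_outlier = [84, 84, 89, 91, 110, 114, 150]

mean_original = mean(dbp_original)
mean_outlier = mean(dbp_outlier)
median_original = median(dbp_original)
median_outlier = median(dbp_outlier)

# The mean shifts by almost 5 mmHg; the median does not move
print(f"mean:   {mean_original:.1f} -> {mean_outlier:.1f}")  # mean:   98.3 -> 103.1
print(f"median: {median_original} -> {median_outlier}")      # median: 91 -> 91
```

Only the largest value changed, so the middle of the sorted list stays where it was, while the sum that feeds the mean grows by 34 mmHg.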
Mode